Scanning and Parsing Languages with Ambiguities and Constraints: The Lamb and Fence Algorithms
نویسندگان
چکیده
Traditional language processing tools constrain language designers to specific kinds of grammars. In contrast, model-based language processing tools decouple language design from language processing. These tools allow the occurrence of lexical and syntactic ambiguities in language specifications and the declarative specification of constraints for resolving them. As a result, these techniques require scanners and parsers able to parse context-free grammars, handle ambiguities, and enforce constraints for disambiguation. In this paper, we present Lamb and Fence. Lamb is a scanning algorithm that supports ambiguous token definitions and the specification of custom pattern matchers and constraints. Fence is a chart parsing algorithm that supports ambiguous context-free grammars and the definition of constraints on associativity, composition, and precedence, as well as custom constraints. Lamb and Fence, in conjunction, enable the implementation of the ModelCC model-based language processing tool.
منابع مشابه
تأثیر ساختواژهها در تجزیه وابستگی زبان فارسی
Data-driven systems can be adapted to different languages and domains easily. Using this trend in dependency parsing was lead to introduce data-driven approaches. Existence of appreciate corpora that contain sentences and theirs associated dependency trees are the only pre-requirement in data-driven approaches. Despite obtaining high accurate results for dependency parsing task in English langu...
متن کاملA Constraint-Satisfaction Parser for Context-Free Grammars
Traditional language processing tools constrain language designers to specific kinds of grammars. In contrast, model-based language specification decouples language design from language processing. As a consequence, model-based language specification tools need general parsers able to parse unrestricted context-free grammars. As languages specified following this approach may be ambiguous, pars...
متن کاملFence - An Efficient Parser with Ambiguity Support for Model-Driven Language Specification
Model-based language specification has applications in the implementation of language processors, the design of domain-specific languages, model-driven software development, data integration, text mining, natural language processing, and corpus-based induction of models. Model-based language specification decouples language design from language processing and, unlike traditional grammar-driven ...
متن کاملA Lexical Analysis Tool with Ambiguity Support
Lexical ambiguities naturally arise in languages. We present Lamb, a lexical analyzer that produces a lexical analysis graph describing all the possible sequences of tokens that can be found within the input string. Parsers can process such lexical analysis graphs and discard any sequence of tokens that does not produce a valid syntactic sentence, therefore performing, together with Lamb, a con...
متن کاملParsing Languages with a Configurator
Recent evolution of linguistic theories heavily rely upon the concept of constraint. Also, several authors have pointed out the similitude existing between the categories of feature-based theories and the notions of objects or frames. We show that a generalization of constraint programs called configuration programs can be applied to natural language parsing. We propose here a systematic transl...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1501.02795 شماره
صفحات -
تاریخ انتشار 2015